Model Predictive Control for Dynamic Unreliable

Resource  Allocation1 

David  A.  Castanon  2  and  Jerry  M.  Wohletz  3 


Abstract 

In this paper, we consider a class of unreliable resource allocation
problems where resources assigned may fail to complete a task, and the
outcomes of past resource allocations are observed before new resource
allocations are selected. The resulting temporal allocation problem is a
stochastic control problem, with a state space and control space that grow
exponentially in cardinality with the number of tasks. We introduce an
approximation by enlarging the admissible control space, and show that
this approximation can be solved exactly and efficiently. The
approximation is used in a model predictive control (MPC) algorithm. For
single resource problems, the MPC algorithm completes over 98% of the
task value completed by an optimal dynamic programming algorithm in over
1000 randomly generated problems. On average, it achieves 99.5% of the
optimal performance while requiring over 6 orders of magnitude less
computation.


1  Introduction 

A common assumption in resource allocation problems such as
multiprocessor scheduling or job shop scheduling is that, once a resource
works on a task, the task will be completed successfully. However, there
are many resource allocation problems where resources can fail to
complete the task, and further resource assignments are required for that
task. We refer to this class of problems as unreliable resource
allocation problems. Examples of these problems include assignment of
search activity (e.g. sonobuoys) to sectors [12], assignment of
ground-air missiles to aircraft, or more general assignment of weapons to
targets [5] in diverse military applications. In this paper, we are
interested in the problem of dynamic resource allocation where resources
are non-renewable and unreliable, and where the success of past resource
allocations can be observed before new allocation decisions are made.
These problems require selecting which tasks to process first, and what
resources to hold in reserve for allocation after observing the success
of early allocations.

1 This research was supported by AFOSR under Grant number
F49620-01-1-0348, and by DARPA Information Systems Office and AFRL/VACA
under contract number F33615-01-C-3149.

There is an extensive literature on many variations of weapon target
assignment problems that are formulated as unreliable resource allocation
problems. However, most of these variations consist of static problems
[5] where no information on allocation outcomes is observed. Dynamic
variations of these problems where outcomes of allocations were observed
were studied in [6] and [2]. Recently, Murphey [8, 9] has addressed
stochastic dynamic weapon assignment problems where new tasks arrive over
time, and weapon assignments are unreliable. Murphey allows for
observation of the new task arrivals, but no observation of the past
allocation outcomes. The resulting problem formulation is a stochastic
program [10]. The search theory literature has extensive results on
dynamic search problems; a good overview of these results is available in
[12]. However, the results on dynamic search focus on sequential search
for a single object, where search resources are allocated to a single
site at a time.

We focus on allocating unreliable resources to multiple tasks over two
stages, where the outcomes of resources assigned in the first stage are
observed before the second stage allocations are selected. We pose the
problem as a stochastic control problem; although this problem can be
solved using stochastic dynamic programming (SDP) [1], the number of
states grows exponentially with the number of tasks.

In this paper, we introduce an approximate SDP formulation that can be
solved efficiently, and use it in a model predictive control (MPC)
algorithm. We compare the performance of the MPC controller with that of
the optimal SDP algorithm and a faster suboptimal SDP algorithm using
random problems. Our results show that the model predictive algorithm
achieves on average over 99% of the performance of the optimal SDP
algorithm, while computing allocations for nearly 1000 tasks in under 1
second.

The rest of this paper is organized as follows: Section 2 describes the
mathematical formulation. Section 3 discusses the exact SDP algorithm,
and a simpler approximate SDP algorithm. Section 4 describes the fast
approximate SDP formulation and its solution, and the MPC approach.
Section 5 discusses the numerical experiments. Due to space limitations,
the proofs are omitted, and can be found in [4].


2  Problem  Formulation 

Assume that there are $N$ tasks, indexed by $i = 1, \ldots, N$, and that
there are $M$ non-renewable homogeneous resources which can be assigned
to each task over two possible stages. Associated with each task is a
value $V_i$ which is obtained by completing the task. Use of a resource
incurs a cost $C$. When a resource is assigned to task $i$ in stage $k$,
the event that the resource successfully completes the task has
probability $p_i(k)$, and this event is independent of any other events
generated by other resource assignments.

Let $x_i(1)$ denote the number of resources assigned to task $i$ at
stage 1. Under the independence assumptions, the probability that task
$i$ is not completed is given by:

$$P_S(i,1) = (1 - p_i(1))^{x_i(1)} \qquad (1)$$

At the completion of stage 1, the set of completed tasks will be
observed. Let $\Omega = \{0, 1\}^N$ denote the set of possible values of
this observation, where $\omega_i = 0$ denotes that task $i$ was
completed in stage 1, and $\omega_i = 1$ denotes the complementary event
for task $i$. Given a vector $x(1)$, eq. (1) induces a probability
distribution $P(\omega | x(1))$ on the possible outcomes. The stage 2
allocations are strategies $x(2,\omega)$ that depend on the specific
observed outcome. We refer to these strategies as recourse strategies.

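Given resource allocations $x(1)$ and recourse strategies $x(2,\omega)$,
the probability that task $i$ is not completed either in stage 1 or in
stage 2 is given by

$$P_S(i,2) = \sum_{\omega \in \Omega} P(\omega | x(1)) \, I(\omega_i = 1) \, (1 - p_i(2))^{x_i(2,\omega)} \qquad (2)$$

where $I(\cdot)$ is the indicator function, and

$$P(\omega | x(1)) = \prod_{\{i | \omega_i = 0\}} \left[ 1 - (1 - p_i(1))^{x_i(1)} \right] \cdot \prod_{\{i | \omega_i = 1\}} (1 - p_i(1))^{x_i(1)}$$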
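To make eqs. (1) and (2) concrete, the following is a minimal sketch that enumerates the outcome distribution for a small problem. The function names and the dictionary representation of the recourse strategy are illustrative assumptions, not part of the paper, and the enumeration is exponential in the number of tasks, so it is only meant for tiny examples.

```python
from itertools import product


def outcome_distribution(x1, p1):
    """Return {omega: P(omega | x(1))} over all omega in {0,1}^N.

    omega[i] == 1 means task i is still incomplete after stage 1.
    """
    # Eq. (1): probability that task i survives its stage 1 allocation.
    survive = [(1.0 - p) ** x for p, x in zip(p1, x1)]
    dist = {}
    for omega in product((0, 1), repeat=len(x1)):
        prob = 1.0
        for w, s in zip(omega, survive):
            prob *= s if w == 1 else (1.0 - s)
        dist[omega] = prob
    return dist


def survival_after_two_stages(i, x1, p1, p2, recourse):
    """Eq. (2): probability that task i is incomplete after both stages.

    recourse (hypothetical representation) maps each outcome omega to
    the stage 2 allocation vector used for that outcome.
    """
    dist = outcome_distribution(x1, p1)
    return sum(
        prob * (1.0 - p2[i]) ** recourse[omega][i]
        for omega, prob in dist.items()
        if omega[i] == 1
    )
```

For two tasks with `x1 = [1, 2]` and success probabilities of 0.5, the outcome in which both tasks remain incomplete has probability 0.5 × 0.25.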

The stochastic control problem is to select resource allocations $x(1)$
and recourse strategies $x(2,\omega)$ that minimize the expected
incomplete task value plus the expected cost of using resources:

$$\min \; \sum_{i=1}^{N} V_i P_S(i,2) + C \, E\Big\{ \sum_{i=1}^{N} \left[ x_i(1) + x_i(2,\omega) \right] \Big\} \qquad (3)$$

subject to the constraints

$$\sum_{i=1}^{N} \left( x_i(1) + x_i(2,\omega) \right) \le M \quad \text{for all } \omega \in \Omega \qquad (4)$$

$$x_i(1), x_i(2,\omega) \in \{0, 1, \ldots, M\} \quad \text{for all } \omega \in \Omega, \; i \qquad (5)$$

The above problem is a two-stage stochastic feedback control problem with
a discrete state space that grows exponentially in the number of tasks
$N$, and a decision space which grows exponentially in the number of
tasks. In the next section, we discuss the solution of this problem using
stochastic dynamic programming.


3  Stochastic  Dynamic  Programming  Solution 

Consider the problem at the second stage, after the state $\omega$ has
been observed. Without loss of generality, assume that there are $N_2$
incomplete tasks in $\omega$, renumbered as $j = 1, \ldots, N_2$, and
that there are $M_2$ resources remaining. The second stage problem can be
expressed as follows:

$$\min_{x} \; \sum_{j=1}^{N_2} \left[ V_j (1 - p_j(2))^{x_j} + C x_j \right] \qquad (6)$$

subject to

$$\sum_{j=1}^{N_2} x_j \le M_2, \quad x_j \in \{0, 1, \ldots, M_2\} \qquad (7)$$

Define real-valued functions over nonnegative integer allocation
variables $n \in \{0, \ldots, M_2\}$ as

$$f_j(n) = V_j (1 - p_j(2))^{n} + C n \qquad (8)$$

and define $f_j(x)$, $x \in [0, M_2]$, as the linear interpolation of the
function $f_j(n)$. Note that $f_j(x)$ is convex in $x \in [0, M_2]$, as
it is the sum of two convex functions. Relaxing the second stage
optimization problem to allow for real-valued allocations results in a
monotropic optimization problem [11] of the form

$$\min_{x} \; \sum_{j=1}^{N_2} f_j(x_j) \qquad (9)$$

subject to

$$\sum_{j=1}^{N_2} x_j \le M_2, \quad x_j \ge 0 \qquad (10)$$

The piecewise linear, convex nature of $f_j(x)$, together with the fact
that $M_2$ is an integer and all the points of nondifferentiability of
$f_j(x)$ correspond to integer $x$, guarantees that the solution to
eqs. (9-10) is an integer [11]. Furthermore, the separable convex nature
of the objective function and the single additive constraint leads to
simple computations of subgradients, and guarantees the existence of a
scalar Lagrange multiplier which satisfies the Karush-Kuhn-Tucker
conditions [11]. The result is a fast algorithm for determining the
optimal recourse allocations $x_i(2,\omega)$ and the corresponding
optimal value $J_2^*(\omega, M_2)$, which is equivalent to an incremental
line search for the optimal Lagrange multiplier value. The key structural
result is stated below:


Lemma 1 Consider the second stage allocation problem defined by (9-10).
There exists a nonnegative value $\lambda^*$ such that the optimal
resource allocation $\{x_j^*\}$ is given by

$$x_j^* \in \arg\min_{x_j \in [0, M_2]} \left[ f_j(x_j) + \lambda^* x_j \right] \qquad (11)$$

Furthermore, $\lambda^*$ can be chosen as the negative of one of the
slopes of the piecewise linear functions $f_j(x)$.


The optimal solution has an incremental optimality property: The optimal
solution for a given value of $M_2$ is part of an optimal solution for
any value $M > M_2$. This leads to efficient algorithms for solving the
second stage allocation problem, of complexity $O(N + M \ln(N + M))$ [5].
These algorithms are used for each possible first stage outcome $\omega$
and remaining resource level $M_2$ to compute the optimal cost-to-go
$J_2^*(\omega, M_2)$. The optimal cost-to-go has the following
properties:
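The incremental optimality property suggests a greedy marginal allocation: repeatedly give the next resource to the task with the largest marginal reduction in $f_j$, and stop when no reduction is positive. The sketch below is one way to do this with a heap; it is an illustration under that reading, not necessarily the algorithm of [5], and the function name `solve_second_stage` is hypothetical.

```python
import heapq


def solve_second_stage(values, probs, cost, budget):
    """Greedy solution of min sum_j V_j (1-p_j)^x_j + C x_j, sum_j x_j <= budget.

    Returns (allocation list, objective value). Greedy is valid here
    because each f_j is convex, so its marginal gains are nonincreasing.
    """
    n = len(values)
    alloc = [0] * n
    # Gain of the first unit for task j: f_j(0) - f_j(1) = V_j p_j - C.
    heap = [(-(values[j] * probs[j] - cost), j) for j in range(n)]
    heapq.heapify(heap)
    remaining = budget
    while remaining > 0 and heap:
        neg_gain, j = heapq.heappop(heap)
        if -neg_gain <= 0:
            break  # no task benefits from another resource
        alloc[j] += 1
        remaining -= 1
        # Gain of the next unit: f_j(x) - f_j(x+1) = V_j (1-p_j)^x p_j - C.
        next_gain = values[j] * (1 - probs[j]) ** alloc[j] * probs[j] - cost
        heapq.heappush(heap, (-next_gain, j))
    total = sum(v * (1 - p) ** x + cost * x
                for v, p, x in zip(values, probs, alloc))
    return alloc, total
```

Running it with a budget of 2 and then 3 on the same tasks illustrates the property stated above: the smaller allocation is contained in the larger one.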


Lemma 2 The optimal cost-to-go function $J_2^*(\omega, M_2)$ has the
following properties:

1. $J_2^*(\omega, M_2)$ is a convex, piecewise linear, nonincreasing
function of $M_2$ with breakpoints only at integer values of $M_2$.

2. Consider two distinct outcomes $\omega^{(1)}, \omega^{(2)}$. If
$\omega_i^{(1)} \le \omega_i^{(2)}$ for $i = 1, \ldots, N$, then
$J_2^*(\omega^{(1)}, M_2) \le J_2^*(\omega^{(2)}, M_2)$.


Consider now the first stage problem. The solution of eqs. (3-5)
satisfies the stochastic dynamic programming recursion

$$J^* = \min_{x(1) \in \{0, \ldots, M\}^N} \; \sum_{\omega \in \Omega} P(\omega | x(1)) \, J_2^*\Big(\omega, M - \sum_{i=1}^{N} x_i(1)\Big) + C \sum_{i=1}^{N} x_i(1) \qquad (12)$$

subject to the constraint

$$\sum_{i=1}^{N} x_i(1) \le M \qquad (13)$$

Unfortunately, the above optimization problem is a non-separable integer
programming problem, and the objective function has $2^N$ terms in the
summation. Exact solution of this problem is a difficult combinatorial
problem. However, the presence of the single constraint (13) suggests the
use of an incremental optimization approach similar to that used for the
second stage problem, as follows: Define the notation $x_i^+$ to denote
the vector $(x_1 \; \ldots \; x_{i-1} \;\; x_i + 1 \;\; x_{i+1} \; \ldots \; x_N)$.
Let $J(x)$ be defined as

$$J(x) = \sum_{\omega} P(\omega | x) \, J_2^*\Big(\omega, M - \sum_{i=1}^{N} x_i\Big) + C \sum_{i=1}^{N} x_i \qquad (14)$$

The incremental algorithm can be described as follows:

1. Initialize $x_i = 0$, $i = 1, \ldots, N$.

2. For each $i$, compute $MR_i(x) = J(x) - J(x_i^+)$.

3. Select $i^*$ for which $MR_{i^*}(x) \ge MR_i(x)$ for all $i \ne i^*$.

4. If $MR_{i^*}(x) > 0$ and $\sum_i x_i < M$, set $x = x_{i^*}^+$;
otherwise, stop.

Repeat steps 2-4 until the algorithm stops.

Note that the solution to eqs. (12,13) is not guaranteed to have the
incremental optimality property. Thus, the above algorithm is only an
approximate algorithm, although our experimental results indicate its
performance is indistinguishable from that of an enumerative search. Note
also that computation of $MR_i(x)$ still requires summation over $2^N$
terms, an exponential complexity in the number of tasks. In the next
section, we describe an alternative suboptimal approach, based on using
Model Predictive Control [7] with an approximate optimization model,
which can generate solutions in complexity $O((N + M) \ln N)$.
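The incremental first stage algorithm of this section (steps 1-4, with $J(x)$ as in eq. (14)) can be sketched as follows. This is a brute-force illustration for small $N$ only: it enumerates all $2^N$ outcomes, and the helper names are assumptions made for the example. The second stage cost-to-go is computed greedily, which relies on the convexity of the second stage objective.

```python
from itertools import product


def second_stage_cost(values, p2, cost, budget):
    """Optimal stage 2 cost for the listed incomplete tasks (greedy)."""
    x = [0] * len(values)
    total = sum(values)  # cost with no stage 2 resources assigned
    left = budget
    while left > 0:
        # Marginal reduction of V_j (1-p_j)^x_j + C x_j from one more unit.
        gains = [v * (1 - p) ** xi * p - cost
                 for v, p, xi in zip(values, p2, x)]
        if not gains or max(gains) <= 0:
            break
        j = gains.index(max(gains))
        x[j] += 1
        total -= gains[j]
        left -= 1
    return total


def first_stage_cost(x1, values, p1, p2, cost, budget):
    """J(x) of eq. (14): expected cost-to-go plus stage 1 resource cost."""
    used = sum(x1)
    expected = 0.0
    for omega in product((0, 1), repeat=len(x1)):
        prob = 1.0
        for w, p, x in zip(omega, p1, x1):
            s = (1 - p) ** x
            prob *= s if w else 1 - s
        if prob == 0.0:
            continue
        alive = [i for i, w in enumerate(omega) if w]
        expected += prob * second_stage_cost(
            [values[i] for i in alive], [p2[i] for i in alive],
            cost, budget - used)
    return expected + cost * used


def incremental_allocation(values, p1, p2, cost, budget):
    """Steps 1-4: add one resource at a time while J(x) strictly improves."""
    x = [0] * len(values)
    current = first_stage_cost(x, values, p1, p2, cost, budget)
    while sum(x) < budget:
        best_gain, best_i, best_cost = 0.0, None, None
        for i in range(len(x)):
            x[i] += 1
            c = first_stage_cost(x, values, p1, p2, cost, budget)
            x[i] -= 1
            if current - c > best_gain:
                best_gain, best_i, best_cost = current - c, i, c
        if best_i is None:
            break
        x[best_i] += 1
        current = best_cost
    return x, current
```

For a single task with value 10, success probability 0.8 in both stages, unit cost 1 and two resources, the sketch assigns one resource in stage 1 and holds the other in reserve, since assigning both at once has a higher expected cost.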


4  Model  Predictive  Control 


In order to avoid the exponential growth in complexity as the number of
tasks and resources grows, we propose an alternative algorithm based on
model predictive control (MPC). The algorithm uses an aggregate model of
the problem. This aggregate model is based on replacing the $2^N$
constraints in eq. (4) by one average resource utilization constraint.
This constraint requires that the average number of resources used across
all sample paths cannot exceed the available number of resources. This
approach is similar to the approach used in [3, 13] for other dynamic
resource allocation problems. Mathematically, the new constraint is:

$$\sum_{i=1}^{N} \sum_{\omega \in \Omega} P(\omega | x(1)) \left( x_i(1) + x_i(2,\omega) \right) \le M \qquad (15)$$
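The relation between the pathwise constraints in eq. (4) and the averaged constraint in eq. (15) can be checked in one line. The derivation below is a short worked step, using only that the outcome probabilities are nonnegative and sum to one: weight constraint (4) for each outcome by its probability and add over outcomes.

```latex
\sum_{\omega \in \Omega} P(\omega \mid x(1))
  \sum_{i=1}^{N} \bigl( x_i(1) + x_i(2,\omega) \bigr)
\;\le\; \sum_{\omega \in \Omega} P(\omega \mid x(1)) \, M
\;=\; M
```

The converse does not hold: a strategy may satisfy the average constraint while exceeding $M$ on some individual outcomes, which is why the averaged problem is a relaxation.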


The optimization problem used in the MPC approach is to minimize eq. (3)
subject to the constraints in eqs. (15,5). The solution determines $x(1)$
as well as strategies for future allocations. Only the first stage
allocations are implemented; subsequently, based on the observed outcome
$\omega$, the second stage allocations are determined using the approach
discussed previously. Note that every strategy which was feasible for the
constraints in eq. (4) satisfies the new constraint in eq. (15). Thus,
the relaxed problem overestimates the expected performance achievable in
the original problem.

As a preliminary step, we expand the set of admissible strategies to
include mixed strategies. That is, we introduce an additional random
variable $\theta$, independent of other random variables, and admissible
decision strategies are of the form $x(1,\theta), x(2,\omega,\theta)$.
Since the number of pure strategies is finite, the use of mixed
strategies allows for the full utilization of the available resources in
eq. (15). Note that, in the original stochastic dynamic programming
problem, the use of mixed strategies does not change the optimal cost, as
there exists an optimal solution which uses only pure strategies.
However, the relaxed problem in eqns. (3,15,5) will typically have a
better cost when mixed strategies are allowed, due to the knapsack nature
of the integer allocation problem: mixed strategies will allow full
utilization of the available resources in the constraint of eqn. (15).


Let $(J^k, R^k)$ denote the expected performance and resource utilization
of pure strategy $k$; mixed strategies allow us to achieve any
performance and resource utilization $(J, R)$ in the convex hull of these
expected performance-resource pairs. We define local strategies as
follows:


Definition 1 A local strategy consists of a pure strategy with the
property that $x_i(2,\omega) = x_i(2,\omega_i)$.

Thus, local strategies generate recourse allocations for individual tasks
based on the observed state of that task only. In contrast, general
strategies use recourse allocations that depend on the combined states of
all the tasks $\omega$.

Theorem 1 Consider the optimization problem in eqns. (3,15,5). Given any
pure strategy, there is a mixed strategy using only local strategies that
achieves the same expected performance and the same expected resource
use.

The result is based on the property that the objectives and the averaged
constraints can be decomposed additively over tasks. This leads to an
explicit construction of the mixed local strategies which have the same
expected performance and expected resource use as a given pure strategy.

Let $k$ denote an index over all local strategies, and let $(J^k, R^k)$
denote the expected performance and resource utilization of strategy $k$.
The optimal mixed strategy is the solution of the linear program

$$\min_{\theta} \; \sum_{k} \theta^k J^k \qquad (16)$$

subject to

$$\sum_{k} \theta^k R^k \le M; \quad \sum_{k} \theta^k = 1, \quad 0 \le \theta^k \le 1$$

This linear program is over mixtures of all local strategies, which is a
large number. The next results provide a better characterization of the
optimal strategy.

Lemma 3 There is an optimal mixed strategy which is a mixture of at most
two local strategies.

Let $(J_i^k, R_i^k)$ denote the expected performance and resource
allocation for task $i$ under local strategy $k$. Then,

$$J_i^k = \sum_{\omega_i} P(\omega_i | x_i(1)) \left\{ V_i \, I(\omega_i = 1) (1 - p_i(2))^{x_i(2,\omega_i)} + C x_i(2,\omega_i) \right\} + C x_i(1) \qquad (17)$$

Define the functions $F_i(T_i)$ as the solution of the following single
task resource allocation problem:

$$F_i(T_i) = \min_{\theta} \; \sum_{k} J_i^k \theta^k \qquad (18)$$

subject to

$$\sum_{k} \theta^k R_i^k \le T_i; \quad \sum_{k} \theta^k = 1, \quad 0 \le \theta^k \le 1 \qquad (19)$$

Lemma 4 $F_i(T_i)$ are piecewise linear, convex, nonincreasing functions
of $T_i$.

Lemma 5 The corners of the functions $F_i(T_i)$ correspond to solutions
of the following equation for nonnegative values of $\lambda$

Furthermore, the optimizing solutions $x^*(\lambda), y^*(\lambda)$ are
monotone nonincreasing in $\lambda$.
Using the above properties, one can develop a fast algorithm for
computing the function $F_i(T_i)$ in complexity $O(M \log M)$ for each
task $i$, as described in [4].
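As an illustration of what $F_i(T_i)$ looks like, the sketch below builds it by brute force for one task: it enumerates local strategies, evaluates the expected performance from eq. (17) together with the expected resource use, and takes the nonincreasing lower convex envelope of the resulting points. This is not the $O(M \log M)$ algorithm of [4]. It assumes that a task completed in stage 1 receives no stage 2 resources and that expected resource use is $x_i(1)$ plus the stage 1 survival probability times $x_i(2,1)$; all function names are hypothetical.

```python
def local_strategy_points(value, p1, p2, cost, budget):
    """Enumerate (R, J) pairs for the local strategies (x1, x2) of one task.

    x1 is the stage 1 allocation; x2 is the stage 2 allocation used only
    if the task is still incomplete (assumption made for this sketch).
    """
    points = []
    for x1 in range(budget + 1):
        s1 = (1 - p1) ** x1  # P(task incomplete after stage 1)
        for x2 in range(budget + 1):
            resource = x1 + s1 * x2  # expected resource use
            perf = value * s1 * (1 - p2) ** x2 + cost * resource  # eq. (17)
            points.append((resource, perf))
    return points


def lower_envelope(points):
    """Corner points of F(T): nonincreasing lower convex hull of (R, J)."""
    hull = []
    for r, j in sorted(set(points)):
        if hull and j >= hull[-1][1]:
            continue  # uses more resource without improving performance
        while len(hull) >= 2:
            (r1, j1), (r2, j2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the chord to (r, j).
            if (j2 - j1) * (r - r1) >= (j - j1) * (r2 - r1):
                hull.pop()
            else:
                break
        hull.append((r, j))
    return hull


def evaluate(hull, t):
    """F(T) by linear interpolation; constant beyond the last corner."""
    if t >= hull[-1][0]:
        return hull[-1][1]
    for (r1, j1), (r2, j2) in zip(hull, hull[1:]):
        if r1 <= t <= r2:
            return j1 + (j2 - j1) * (t - r1) / (r2 - r1)
    raise ValueError("t must be nonnegative")
```

A point on a segment between two corners corresponds to a mixture of the two local strategies at its endpoints, which matches the statement of Lemma 3 for a single task.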

With the above notation, we can rewrite the MPC problem using mixed local
strategies as the following hierarchical problem

$$\min_{T} \; \sum_{i=1}^{N} F_i(T_i) \qquad (21)$$

subject to

$$\sum_{i=1}^{N} T_i \le M, \quad T_i \ge 0, \; i = 1, \ldots, N \qquad (22)$$

Tasks   Resources   IA Alg.   MPC Alg.   Worst MPC
  7         7        100%      99.92%     98.6%
  7         9        100%      99.82%     99.18%
  7        11        100%      99.996%    99.86%
  9         7        100%      99.91%     98.30%
  9         9        100%      99.89%     99.48%
  9        11        100%      99.82%     98.96%
 11         7        100%      99.92%     99.53%
 11         9        100%      99.93%     99.56%
 11        11        100%      99.74%     99.19%

Table 1: Performance of the Model Predictive Control (MPC) and
Incremental Algorithm (IA) as percent of value completed by DP


This is another monotropic programming problem, of the type discussed
earlier in the second stage of dynamic programming. The only difference
is that the corner points of $F_i(T_i)$ do not occur at integer values of
$T_i$. The optimal solution is obtained by the same algorithm: the
negatives of the slopes of the $F_i(T_i)$ segments are possible values of
the Lagrange multiplier $\lambda$ associated with transitions in resource
allocations. The maximum number of possible transition values of
$\lambda$ is $2M$ per task, for a total less than or equal to $2MN$.
Performing a line search over these values results in a polynomial time
algorithm for exact solution of the Model Predictive Control problem; the
solution will have the property that only one task (corresponding to a
negative slope equal to the final value of $\lambda$) will use a local
mixed strategy. A faster algorithm that computes the slopes incrementally
for each $i$, and keeps track only of the next slope for each task, can
be shown to solve the problem in complexity $O((M + N) \log N)$.
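The line search over slopes can be illustrated with a simple sort-based sketch: collect the segments of every $F_i$, process them from the most negative slope upward, and stop when the budget $M$ is exhausted or no segment reduces the cost. It assumes each $F_i$ is supplied as a list of corner points, and it sorts all segments up front rather than computing slopes incrementally, so it does not reproduce the faster algorithm mentioned above. The name `allocate_budget` is hypothetical.

```python
def allocate_budget(hulls, budget):
    """Solve min sum_i F_i(T_i) subject to sum_i T_i <= budget, T_i >= 0.

    hulls[i] lists the corner points (T, F_i(T)) of a convex, nonincreasing,
    piecewise linear F_i, in increasing T and starting at T = 0.
    """
    segments = []
    for i, hull in enumerate(hulls):
        for (t1, f1), (t2, f2) in zip(hull, hull[1:]):
            slope = (f2 - f1) / (t2 - t1)
            segments.append((slope, i, t2 - t1))
    # Most negative slope first: largest cost reduction per unit resource.
    # Convexity guarantees each task's segments are consumed in order.
    segments.sort()
    T = [0.0] * len(hulls)
    remaining = float(budget)
    for slope, i, length in segments:
        if slope >= 0 or remaining <= 0:
            break
        step = min(length, remaining)  # last segment may be used partially
        T[i] += step
        remaining -= step
    return T
```

Only the last segment processed can be used partially, so at most one task ends strictly between two corners of its $F_i$; that task is the one using a local mixed strategy.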

Once the model predictive control solution is determined, the first stage
allocations are assigned to each task. The only ambiguity occurs when the
local mixed strategy for the final task is a mixture of two different
first stage allocations; in this case, we allocate the smaller of the two
first stage allocations to that task.


5  Experimental  Results 

In  order  to  evaluate  the  effectiveness  of  the  proposed 
MPC  approach,  we  conducted  several  experiments  with 
the  following  algorithms: 

1. The exact SDP solution, obtained by enumerating the possible first
stage allocations and finding a global minimum.

2.  The  Incremental  DP  (IA)  algorithm  discussed  at 
the  end  of  Section  3. 

3.  The  MPC  algorithm  described  in  Section  4. 


The first set of experiments consisted of random problems with 7 to 11
tasks, with task values selected randomly in the range of 1-10, and task
success probabilities selected randomly in the interval [0.7, 0.9]. The
number of resources for each number of tasks varied from 7 to 11
resources. For each combination of number of tasks and number of
resources, we generated 100 random problems, and obtained the solution
(in terms of value achieved) by the SDP, IA and MPC algorithms. The
statistics in the experiment report the percentage of the value achieved
by the optimal SDP algorithm averaged over the 100 problems. We also
computed the worst case percentage difference in performance between the
MPC algorithm and the SDP algorithm. The results are summarized in Table
1. The results indicate that the performance of IA was optimal for all
random problems generated. The results also show that the MPC algorithm
yields near-optimal performance: The worst case performance across 900
problems tested was within 2% of the optimal SDP performance, and the
average performance was within 0.3% of the optimal SDP performance.
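A test harness in the spirit of this setup could look like the sketch below. It assumes task values and success probabilities are drawn uniformly from the stated ranges and that the same draw procedure applies to each problem; the paper does not state the distributions, so these are assumptions, and the function names are hypothetical.

```python
import random


def random_problem(num_tasks, rng):
    """Draw task values in [1, 10] and success probabilities in [0.7, 0.9].

    Uniform sampling is an assumption made for this sketch.
    """
    values = [rng.uniform(1.0, 10.0) for _ in range(num_tasks)]
    probs = [rng.uniform(0.7, 0.9) for _ in range(num_tasks)]
    return values, probs


def summarize(ratios):
    """Average and worst-case percentage from per-problem ratios.

    Each ratio is (value achieved by an algorithm) / (value achieved by
    the reference algorithm) on one problem instance.
    """
    percents = [100.0 * r for r in ratios]
    return sum(percents) / len(percents), min(percents)
```

Passing a seeded `random.Random` instance keeps a batch of generated problems reproducible, so the same instances can be given to each algorithm being compared.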

The second set of experiments used problems with 16 and 20 tasks, and
with a varying number of resources from 12 to 20. For these problems,
computing the exact dynamic programming solution using enumerative
techniques was prohibitively slow. As a reference point, it required 3
days on a LINUX Pentium 1.7 GHz workstation to solve 100 instances of the
11 task problem. We compared results only for the IA and MPC algorithms.
The statistics reported are the percentage of the value achieved by the
IA algorithm. The results are summarized in Table 2. The results in Table
2 confirm the near optimal behavior of the Model Predictive Control
algorithm. The average performance is within 0.2% of the performance of
the IA algorithm, and the worst case performance is within 1% of the
performance of the IA algorithm. The experiments confirm that the MPC
algorithm's bias to commit more resources in the first stage has a nearly
negligible impact on overall task performance.
Tasks   Resources   Ave. MPC   Worst MPC
 16        12        99.81%     99.22%
 16        16        99.82%     99.33%
 16        20        99.92%     99.67%
 20        12        99.85%     99.46%
 20        16        99.85%     99.52%
 20        20        99.88%     99.37%

Table 2: Performance of the Model Predictive Control (MPC) algorithm as
percent of value completed by Incremental Algorithm (IA) for different
numbers of tasks.


The IA algorithm required over 13 minutes to solve a single instance of a
20 task, 20 resource problem on a Pentium 1.4 GHz workstation running
Linux. In contrast, the MPC algorithm solved 100 instances of 1000 task,
1000 resource problems in a total of 3.5 seconds. This suggests that the
MPC algorithm is well suited to applications where information about
available tasks and values becomes available in real time, and must be
converted into resource allocation decisions quickly.


6  Conclusion 

The problem of allocation of unreliable resources to tasks over multiple
stages arises in many important applications. In this paper, we have
developed a stochastic dynamic programming formulation for this problem,
which captures the opportunity for observing task completion events and
using recourse strategies. However, exact solution of this problem using
Stochastic Dynamic Programming is computationally intensive because both
the state space and the admissible action space grow exponentially with
the number of tasks. As an efficient alternative, we developed a Model
Predictive Control algorithm that is based on solving a relaxed
stochastic dynamic programming problem. We established that the relaxed
problem can be solved very fast, in time nearly linear with the number of
tasks. Furthermore, the resulting algorithm exhibits near-optimal
performance across a range of random test problems.

There are several important directions for extension of this work which
have been pursued in [4]. The first of these is extension of the results
to multiple resource classes and multiple stages. The main theorem in
this paper, the representation of the optimal relaxed strategies in terms
of local strategies, extends in a straightforward manner to these cases.
Another interesting extension is to consider tasks that require multiple
assignment of simultaneous resources to complete. Extensions of our
techniques to this problem, and problems where tasks have precedence
constraints, are currently under investigation.